Non-negative Hidden Markov Modeling of Audio with Application to Source Separation
نویسندگان
چکیده
In recent years, there has been a great deal of work in modeling audio using non-negative matrix factorization and its probabilistic counterparts as they yield rich models that are very useful for source separation and automatic music transcription. Given a sound source, these algorithms learn a dictionary of spectral vectors to best explain it. This dictionary is however learned in a manner that disregards a very important aspect of sound, its temporal structure. We propose a novel algorithm, the non-negative hidden Markov model (N-HMM), that extends the aforementioned models by jointly learning several small spectral dictionaries as well as a Markov chain that describes the structure of changes between these dictionaries. We also extend this algorithm to the non-negative factorial hidden Markov model (N-FHMM) to model sound mixtures, and demonstrate that it yields superior performance in single channel source separation tasks.
منابع مشابه
Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation
The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduce...
متن کاملAudio Source Separation Using Hierarchical Phase-Invariant Models
Audio source separation consists of analyzing a given audio recording so as to estimate the signal produced by each sound source for listening or information retrieval purposes. In the last five years, algorithms based on hierarchical phase-invariant models such as singleor multichannel hidden Markov models (HMMs) or nonnegative matrix factorization (NMF) have become popular. In this paper, we ...
متن کاملSound Event Detection in Multisource Environments Using Source Separation
This paper proposes a sound event detection system for natural multisource environments, using a sound source separation front-end. The recognizer aims at detecting sound events from various everyday contexts. The audio is preprocessed using non-negative matrix factorization and separated into four individual signals. Each sound event class is represented by a Hidden Markov Model trained using ...
متن کاملVocal-tract Modeling for Speaker Independent Single Channel Source Separation
In this paper, we investigate two statistical models for the source-filter based single channel speech separation task. We incorporate source-driven aspects by pitch estimation in the model-driven method which models the vocal-tract part as a priori knowledge. This approach results in a speaker independent (SI) source separation method. For modeling the vocal tract filters Gaussian mixture mode...
متن کاملSource-Filter-Based Single-Channel Speech Separation Using Pitch Information
In this paper, we investigate the source–filter-based approach for single-channel speech separation. We incorporate source-driven aspects by multi-pitch estimation in the model-driven method. For multi-pitch estimation, the factorial HMM is utilized. For modeling the vocal tract filters either vector quantization (VQ) or non-negative matrix factorization are considered. For both methods, the fi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010